Hui Tang, Purdue University, tang227@purdue.edu, Primary
Chao Pan, Purdue University, panc@purdue.edu
Bing Yu, Purdue University, yu245@purdue.edu
Weidan Du, Purdue University, du97@purdue.edu
Shuang Wei, Purdue University, wei93@purdue.edu
Mingran Li, Purdue University, li1940@purdue.edu
Chen Guo, Purdue University, guo171@purdue.edu
Longjie Cheng, Purdue University, cheng70@purdue.edu
Kai Hu, Purdue University, hu332@purdue.edu
Rongrong Zhang, Purdue University, zhan1602@purdue.edu
XinZhe Li, HIT University
Dr. Yingjie (Victor) Chen, Computer Graphics Technology, Purdue University, victorchen@purdue.edu (supervising faculty)
Dr. Zhenyu (Cheryl) Qian, Interaction Design, Purdue University, qianz@purdue.edu (supervising faculty)
Dr. Yu (Michael) Zhu, Statistics, Purdue University, yuzhu@stat.purdue.edu (supervising faculty)
Student Team: YES
Did you use data from both mini-challenges? No
CanvasJS, Gephi
Approximately how many hours were spent working on this submission in total?
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2015 is complete?
Yes
Video Download
Video:
https://va.tech.purdue.edu/vast2015/MC2Video/MC2Video.wmv
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC2.1 – Identify those IDs that stand out for their large volumes of communication. For each of these IDs
a. Characterize the communication patterns you see.
b. Based on these patterns, what do you hypothesize about these IDs?
Limit your response to no more than 4 images and 300 words.
a.
ID 1278894 has the largest communication volume of all IDs (189,894 messages sent in total). On all three days it sends messages periodically from Entry Corridor. As Figure 1 shows, sending always starts after 11:00 AM. There are five sending periods each day, separated by one-hour rests. Within each period, ID 1278894 sends almost the same number of messages every 5 minutes and also receives a few messages.
However, the coverage of the sent messages varies. The first sending period of each day, just after 11:00 AM, has the largest coverage; coverage during the other periods is much smaller.
ID 839736 is also a major source of messages (60,812 in total). Unlike ID 1278894, ID 839736 sends and receives messages continuously on Friday and Saturday in Entry Corridor. As Figure 2 shows, the number of messages it sends each minute is almost the same as the number it receives, and neither changes much. The pattern is almost the same on Sunday, except for a huge burst of sending and receiving at 12:00 and a smaller one at 14:40.
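The per-5-minute sending counts described above could be computed along the following lines. This is only a minimal sketch, not the tooling used for this submission: the record layout (timestamp, fromid, toid), the ISO timestamp format, and the sample rows are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime

def five_minute_counts(records, sender_id):
    """Count messages sent by sender_id in each 5-minute bin."""
    counts = Counter()
    for timestamp, from_id, _to_id in records:
        if from_id != sender_id:
            continue
        t = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        # Round the timestamp down to the start of its 5-minute bin.
        bin_start = t.replace(minute=t.minute - t.minute % 5, second=0)
        counts[bin_start] += 1
    return counts

# Hypothetical sample records, not taken from the challenge data.
sample = [
    ("2014-06-06 11:00:10", 1278894, 100),
    ("2014-06-06 11:03:59", 1278894, 101),
    ("2014-06-06 11:05:00", 1278894, 102),
    ("2014-06-06 11:06:30", 839736, 103),
]
counts = five_minute_counts(sample, 1278894)
for bin_start in sorted(counts):
    print(bin_start, counts[bin_start])
```

Bins with nearly equal counts inside a sending period, and empty bins between periods, would correspond to the periodic pattern reported for ID 1278894.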
b.
ID 1278894 might be a broadcast center related to activities or shows in the park, because it has a fixed location and sends a large number of messages periodically.
ID 839736 might be an information service that answers questions, which would explain why it sends and receives almost the same number of messages. The increase in communication from 12:00 to 14:40 might be due to inquiries about an accident.
Figure 0. Overall Structure
Figure 1.
Figure 2.
Figure 3.
MC2.2 – Describe up to 10 communications patterns in the data. Characterize who is communicating, with whom, when and where. If you have more than 10 patterns to report, please prioritize those patterns that are most likely to relate to the crime.
Limit your response to no more than 10 images and 1000 words.
2. Although every id sent at least one message, 14 ids sent only one message over the three days. The following table summarizes the time and location of communication for those ids.
time | fromid | toid | location | fromx | fromy | tox | toy
6/6/14 9:35:57 | 1763672 | 0 | Kiddie Land | 81 | 77 | 0 | 0
6/6/14 14:47:15 | 1336870 | 0 | Wet Land | 22 | 34 | 0 | 0
6/7/14 9:32:47 | 596672 | 0 | Tundra Land | 41 | 75 | 0 | 0
6/7/14 10:38:23 | 1458915 | 0 | Wet Land | 32 | 33 | 0 | 0
6/7/14 11:27:30 | 1680161 | 0 | Kiddie Land | 87 | 48 | 0 | 0
6/8/14 9:38:58 | 215220 | 0 | Kiddie Land | 80 | 74 | 0 | 0
6/8/14 11:19:35 | 365259 | 0 | Wet Land | 69 | 44 | 0 | 0
6/8/14 11:52:54 | 474843 | 0 | Wet Land | 69 | 44 | 0 | 0
6/8/14 12:26:03 | 688489 | 0 | Wet Land | 63 | 43 | 0 | 0
6/7/14 8:51:15 | 1187304 | 839736 | Wet Land | 56 | 31 | 0 | 0
6/7/14 13:41:21 | 1038617 | 839736 | Tundra Land | 35 | 65 | 0 | 0
6/8/14 9:38:29 | 825934 | 839736 | Tundra Land | 49 | 83 | 0 | 0
6/8/14 10:16:51 | 1658667 | 839736 | Wet Land | 16 | 49 | 0 | 0
6/8/14 13:07:38 | 906235 | 839736 | Tundra Land | 16 | 66 | 0 | 0
The table above shows that every id that sent only one message sent it either to the external entity (id 0) or to id 839736, the high-volume id discussed in MC2.1.
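A filter for such single-message senders could look like the sketch below. It is illustrative only and assumes the log has been reduced to (fromid, toid) pairs; three pairs are copied from the table above, while sender 42 and its recipients are hypothetical and are included only to show an id that is filtered out.

```python
from collections import Counter

def single_message_senders(records):
    """Map each id that sent exactly one message to its single recipient."""
    sent = Counter(from_id for from_id, _to_id in records)
    return {from_id: to_id for from_id, to_id in records if sent[from_id] == 1}

records = [
    (1763672, 0),        # from the table above
    (1187304, 839736),   # from the table above
    (906235, 839736),    # from the table above
    (42, 7),             # hypothetical sender with two messages
    (42, 8),
]
singles = single_message_senders(records)
print(singles)
print(set(singles.values()))  # distinct recipients of the single messages
```

Collecting the distinct recipients in this way is one route to the observation that the only targets are id 0 and id 839736.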
3. There are hub groups, each with a center id that communicated with every other member while no other members communicated with each other. The following graph shows an example from Sunday at 11:25:49.
The communication detail is shown in the following table:
time | fromid | toid | location | fromx | fromy | tox | toy
6/8/14 11:25:49 | 897528 | 1240560 | Wet Land | 62 | 42 | 42 | 20
6/8/14 11:25:49 | 897528 | 509717 | Wet Land | 62 | 42 | 87 | 68
6/8/14 11:25:49 | 897528 | 1078759 | Wet Land | 62 | 42 | 42 | 20
6/8/14 11:25:49 | 897528 | 1861415 | Wet Land | 62 | 42 | 62 | 41
6/8/14 11:25:49 | 897528 | 1445101 | Wet Land | 62 | 42 | 42 | 20
6/8/14 11:25:49 | 897528 | 457576 | Wet Land | 62 | 42 | 62 | 43
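Whether a set of messages forms such a single-center hub can be checked with a small test like the one below. This is a sketch under assumptions, not the method used to produce the graph: messages are treated as undirected (fromid, toid) edges within one time window, and the extra leaf-to-leaf edge is hypothetical.

```python
def star_center(edges):
    """Return the center id if every edge touches one common id, else None.

    If a single id appears in every edge, no edge can link two
    non-center ids, which is the hub pattern described above.
    """
    if not edges:
        return None
    common = set(edges[0])
    for a, b in edges[1:]:
        common &= {a, b}
    # Exactly one shared id means a star; two means a single repeated pair.
    return common.pop() if len(common) == 1 else None

# (fromid, toid) pairs copied from the table above.
hub_edges = [
    (897528, 1240560), (897528, 509717), (897528, 1078759),
    (897528, 1861415), (897528, 1445101), (897528, 457576),
]
print(star_center(hub_edges))

# Adding a hypothetical message between two non-center ids breaks the star.
print(star_center(hub_edges + [(1240560, 509717)]))
```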
4. There are hubs with two centers, where each center id sent messages to the rest of the group but the non-center ids did not communicate with each other. An example is shown in the following graph:
The communication detail is shown in the following table:
time | fromid | toid | location | fromx | fromy | tox | toy
2014-06-08 11:26:10 | 51523 | 648000 | Wet Land | 17 | 43 | 17 | 43
2014-06-08 11:26:10 | 51523 | 1606701 | Wet Land | 17 | 43 | 17 | 43
2014-06-08 11:26:10 | 51523 | 730487 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:10 | 51523 | 1592785 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:10 | 51523 | 1095102 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:10 | 51523 | 1729991 | Wet Land | 17 | 43 | 17 | 43
2014-06-08 11:26:10 | 51523 | 1753569 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:37 | 1729991 | 51523 | Wet Land | 17 | 43 | 17 | 43
2014-06-08 11:26:37 | 1729991 | 648000 | Wet Land | 17 | 43 | 17 | 43
2014-06-08 11:26:37 | 1729991 | 1606701 | Wet Land | 17 | 43 | 17 | 43
2014-06-08 11:26:37 | 1729991 | 730487 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:37 | 1729991 | 1592785 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:37 | 1729991 | 1095102 | Wet Land | 17 | 43 | 16 | 66
2014-06-08 11:26:37 | 1729991 | 1753569 | Wet Land | 17 | 43 | 16 | 66
The data show that the two center ids sent their messages at different times (11:26:10 and 11:26:37). All of the communications occurred in Wet Land, and both center ids were at the same location (17, 43) while sending. Ids 648000 and 1606701 were with the center ids during this period. The other non-center ids were away from the centers but together with one another at (16, 66).
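The co-location reading above can be reproduced by grouping ids by their reported coordinates. The sketch below assumes each row of the table is split into (id, x, y) observations for the sender and the recipient; the coordinates are copied from the table, and the function name is illustrative.

```python
from collections import defaultdict

def group_by_position(observations):
    """Group ids by the (x, y) grid cell in which they were observed."""
    groups = defaultdict(set)
    for id_, x, y in observations:
        groups[(x, y)].add(id_)
    return dict(groups)

# (id, x, y) observations taken from the table above.
observations = [
    (51523, 17, 43), (1729991, 17, 43),    # the two center ids
    (648000, 17, 43), (1606701, 17, 43),   # recipients located with the centers
    (730487, 16, 66), (1592785, 16, 66),   # recipients located elsewhere
    (1095102, 16, 66), (1753569, 16, 66),
]
groups = group_by_position(observations)
for position, ids in sorted(groups.items()):
    print(position, sorted(ids))
```

Two cells appear: one holding both centers plus two recipients, and one holding the four remaining recipients.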
5. There are some cases where two hubs are connected through a middleman. The following graph shows an example:
where id 1337155 and id 966510 are the two hub centers and id 0 is a pseudo id representing an external entity. They are connected through id 254060. In this topology, the two center ids sent messages to the middleman, which then sent messages to id 0.
The following table shows the communication details:
time | fromid | toid | location | fromx | fromy | tox | toy
2014-06-08 11:25:23 | 1337155 | 254060 | Tundra Land | 34 | 65 | 33 | 65
2014-06-08 11:26:22 | 966510 | 254060 | Tundra Land | 36 | 72 | 36 | 72
2014-06-08 11:26:30 | 254060 | 0 | Tundra Land | 37 | 73 | 0 | 0
The communication occurred in Tundra Land. Id 1337155 contacted the middleman, id 254060, before id 966510 did. Center id 966510 and the middleman were close to each other, while the other center, id 1337155, was farther from them.
6. There are large multi-center hubs that are connected to other groups through multiple middlemen.
Ids 1215994, 675346, 376904, 829943, 162071, 1460660, 1952914, 170456, and 1034802 are the major communicators in this group. There are two types of members in this kind of group: one type communicates only with the listed ids, and the other communicates with the listed ids and with other groups as well. The former are shown as the dots enclosed in the middle, and the latter as the dots aligned along the bottom.
The following table shows the time when each of those listed ids sent messages. Each of them sent to multiple ids:
Time | From | Location
2014-06-08 11:25:08 | 1215994 | Coaster Alley
2014-06-08 11:25:15 | 170456 | Entry Corridor
2014-06-08 11:25:17 | 1034802 | Coaster Alley
2014-06-08 11:25:24 | 1952914 | Wet Land
2014-06-08 11:26:50 | 1620771 | Coaster Alley
2014-06-08 11:26:53 | 829943 | Tundra Land
2014-06-08 11:26:54 | 675346 | Wet Land
2014-06-08 11:26:54 | 1460660 | Coaster Alley
2014-06-08 11:26:39 | 376904 | Coaster Alley
It is interesting that each id stayed within a single land while sending its messages. We suspect that these ids are local receptionists for their respective lands.
7. There are some instances where a member of a large group “leaks” messages to an individual who has no connection with the rest of the group. The following is an example:
Id 1022772 belongs to a large group in which heavy communication occurred on Sunday between 12:25:00 and 12:26:59. However, id 584017 communicated with no member of that group other than 1022772.
time | fromid | toid | location | fromx | fromy | tox | toy
2014-06-08 11:24:08 | 1366526 | 584017 | Coaster Alley | 49 | 28 | 38 | 90
2014-06-08 11:24:28 | 1366526 | 584017 | Coaster Alley | 52 | 28 | 38 | 90
2014-06-08 11:26:22 | 1022772 | 584017 | Coaster Alley | 44 | 25 | 38 | 90
2014-06-08 11:27:40 | 1859126 | 584017 | Wet Land | 17 | 43 | 38 | 90
2014-06-08 11:28:09 | 1063010 | 584017 | Wet Land | 69 | 44 | 38 | 90
2014-06-08 11:28:10 | 1394108 | 584017 | Wet Land | 69 | 44 | 38 | 90
2014-06-08 11:28:16 | 715390 | 584017 | Wet Land | 15 | 40 | 38 | 90
2014-06-08 11:28:18 | 18989 | 584017 | Wet Land | 17 | 43 | 38 | 90
2014-06-08 11:28:27 | 640904 | 584017 | Kiddie Land | 71 | 81 | 38 | 90
2014-06-08 11:28:28 | 640904 | 584017 | Kiddie Land | 71 | 81 | 38 | 90
2014-06-08 11:28:30 | 967171 | 584017 | Wet Land | 17 | 43 | 38 | 90
The table above shows that id 1022772 sent a message to id 584017 on Sunday at 11:26:22 from Coaster Alley.
8. There are occasions when the broadcasting sites sent massive numbers of messages. The following is an example in which id 1278894 sent a massive number of messages on Sunday at 12:30:00 and 12:30:01.
MC2.3 – From this data, can you hypothesize when the crime was discovered? Describe your rationale.
Limit your response to no more than 3 images and 300 words.
From this data, we can hypothesize that the crime was discovered at 12:00 on Sunday.
First, because we consider ID 839736 to be an information service center (see our hypothesis in MC2.1), its communication pattern can represent the overall situation to some extent. As Figure 1 shows, there is a burst of message sending and receiving at 12:00 on Sunday that does not occur on Friday or Saturday. We think that many people were asking for more information at that time, so this unusual behavior can indicate when the crime was discovered.
Second, for other IDs, the pattern at 12:00 also differs between Sunday and Saturday, which supports our hypothesis. For example, comparing Figure 2 with Figure 3, we can see that most people in that group sent many more messages during 12:00 - 12:30 on Sunday than during other time periods on Sunday or at any time on Saturday.
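A burst like the one at 12:00 can also be flagged numerically by comparing each minute's message count with a baseline. The sketch below is one possible approach, not the analysis behind the figures: the median baseline, the threshold `factor`, and the sample minute labels are all assumptions, and minutes with no messages at all are simply absent from the counts.

```python
from collections import Counter
from statistics import median

def burst_minutes(minute_labels, factor=3.0):
    """Return the minutes whose message count exceeds factor x the median count."""
    counts = Counter(minute_labels)
    baseline = median(counts.values())
    return sorted(m for m, c in counts.items() if c > factor * baseline)

# Hypothetical per-message minute labels for one id, not challenge data.
labels = ["11:58"] * 2 + ["11:59"] * 2 + ["12:00"] * 10 + ["12:01"] * 3
bursts = burst_minutes(labels)
print(bursts)
```

Running the same check on each day separately would show whether a flagged minute is specific to Sunday.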
Figure 1. ID 839736, Sun (162755)
Figure 2. Sat (403)
Figure 3. Sun (403)